Adaptive Linear Quadratic Control Using Policy Iteration
Authors
Abstract
In this paper we present stability and convergence results for Dynamic Programming-based reinforcement learning applied to Linear Quadratic Regulation (LQR). The specific algorithm we analyze is based on Q-learning, and it is proven to converge to the optimal controller provided that the underlying system is controllable and a particular signal vector is persistently excited. The performance of the algorithm is illustrated by applying it to a model of a flexible beam.
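The abstract describes the algorithm only in outline, so the following is a minimal sketch of Q-function-based policy iteration for a discrete-time LQR problem, not the paper's exact procedure. It assumes dynamics x_{k+1} = A x_k + B u_k, a stage cost x'Ex + u'Fu, a stabilizing initial gain U0, and batch least squares for the policy evaluation step; the function name, the symbol names, and the use of Gaussian exploration noise to keep the regressors persistently excited are all choices made for this illustration. The matrices A and B are used only to simulate the system, never by the learning update itself.

```python
import numpy as np

def lqr_q_policy_iteration(A, B, E, F, U0, n_iters=10, n_samples=200,
                           gamma=1.0, seed=0):
    """Sketch: learn the LQR gain from simulated data via a quadratic Q-function.

    Q_U(x, u) = z' H z with z = [x; u]. H is fitted by least squares from the
    Bellman relation Q_U(x_k, u_k) = r_k + gamma * Q_U(x_{k+1}, U x_{k+1}).
    """
    rng = np.random.default_rng(seed)
    n, m = B.shape
    U = np.array(U0, dtype=float)
    H = None
    for _ in range(n_iters):
        Phi, r = [], []
        x = rng.standard_normal(n)
        for _ in range(n_samples):
            # Exploration noise keeps the regressor vectors excited (assumption).
            u = U @ x + rng.standard_normal(m)
            x_next = A @ x + B @ u          # simulator step only
            z = np.concatenate([x, u])
            z_next = np.concatenate([x_next, U @ x_next])
            # z' H z = vec(H)' kron(z, z), so the Bellman relation is linear in H.
            Phi.append(np.kron(z, z) - gamma * np.kron(z_next, z_next))
            r.append(x @ E @ x + u @ F @ u)
            x = x_next
        # Policy evaluation: least-squares fit of vec(H), then symmetrize.
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(r), rcond=None)
        H = theta.reshape(n + m, n + m)
        H = 0.5 * (H + H.T)
        # Policy improvement: minimize z' H z over u, giving u = -H_uu^{-1} H_ux x.
        U = -np.linalg.solve(H[n:, n:], H[n:, :n])
    return U, H

if __name__ == "__main__":
    # Hypothetical two-state, one-input system, stable in open loop so U0 = 0 is admissible.
    A = np.array([[0.9, 0.2], [0.0, 0.8]])
    B = np.array([[0.0], [1.0]])
    U, H = lqr_q_policy_iteration(A, B, np.eye(2), np.eye(1), np.zeros((1, 2)))
    print("learned feedback gain:", U)
```

Because the system in this sketch is deterministic, the Bellman relation holds exactly for every sample, so the least-squares fit recovers H whenever the data are rich enough; with noisy dynamics or measurements a recursive estimator and more careful excitation would be needed.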
Related resources
Adaptive linear quadratic control using policy iteration - American Control Conference, 1994
In this paper we present stability and convergence results for Dynamic Programming-based reinforcement learning applied to Linear Quadratic Regulation (LQR). The specific algorithm we analyze is based on Q-learning and it is proven to converge to the optimal controller provided that the underlying system is controllable and a particular signal vector is persistently excited. This is the first c...
Optimization of Markov jump linear system with controlled jump probabilities of modes
The optimal control problem of the Markov jump linear quadratic model with controlled jump probabilities of modes is investigated. Two kinds of mode control policies, open-loop control policy and closed-loop control policy, are considered. By using policy iteration and the performance potential concept, a sufficient condition for the optimal closed-loop control policy being better than the optimal o...
Greedy Adaptive Critics for LQR Problems: Convergence Proofs
A number of success stories have been told where reinforcement learning has been applied to problems in continuous state spaces using neural nets or other sorts of function approximators in the adaptive critics. However, the theoretical understanding of why and when these algorithms work is inadequate. This is clearly exemplified by the lack of convergence results for a number of important situa...
Optimal adaptive leader-follower consensus of linear multi-agent systems: Known and unknown dynamics
In this paper, the optimal adaptive leader-follower consensus of linear continuous time multi-agent systems is considered. The error dynamics of each player depends on its neighbors’ information. Detailed analysis of online optimal leader-follower consensus under known and unknown dynamics is presented. The introduced reinforcement learning-based algorithms learn online the approximate solution...
Finite-horizon near optimal adaptive control of uncertain linear discrete-time systems
In this paper, the finite-horizon near optimal adaptive regulation of linear discrete-time systems with unknown system dynamics is presented in a forward-in-time manner by using adaptive dynamic programming and Q-learning. An adaptive estimator (AE) is introduced to relax the requirement of system dynamics, and it is tuned by using Q-learning. The time-varying solution to the Bellman equation i...